# A 6-bit Segmented DAC Architecture With up to 56-GHz Sampling Clock and 6-V<sub>pp</sub> Differential Swing

Andreea Balteanu, Student Member, IEEE, Peter Schvan, Member, IEEE, and Sorin P. Voinigescu, Senior Member, IEEE

Abstract—A distributed power digital-to-analog converter (DAC) architecture with multi-level segmentation is proposed. Due to its large output voltage swing, it can be used as a large swing arbitrary waveform generator suitable for a variety of wireline, fiber optic, and instrumentation applications. A proof-of-concept 56-GS/s 6-bit implementation with most significant bits (MSBs) and least significant bits (LSBs) segmentation and full-rate clock was manufactured in a production 130-nm SiGe BiCMOS technology. The circuit features 14 independent data bits (seven for the three MSBs and seven for the three LSBs), each running at up to at least 44 Gb/s. The measured saturated output power and bandwidth are 17 dBm and 45 GHz, respectively. An output swing of 3.4 $V_{\rm pp}$ per side is observed in 50-$\Omega$ loads. Spectral measurements demonstrate multi-bit modulation at carrier frequencies as high as 56 GHz. To the best of our knowledge, this marks the highest output-bandwidth, highest voltage-swing current-steering DAC in silicon.

*Index Terms*—Current steering, digital-to-analog converter (DAC), distributed amplifiers (DAs), power DAC, segmented DAC, SiGe BiCMOS.

## I. INTRODUCTION

It is expected that global IP traffic will reach zettabyte levels by 2017, with an annual growth rate of over 20% [1]. To satisfy the market demands and make optimal use of the available channel bandwidth, future optical transceivers must allow for adjustable data rates and modulation formats, including multi-carrier orthogonal frequency division multiplexing (OFDM). Such software-defined optical transceivers have recently been proposed [2]–[4] in order to optimize network capacity by dynamically adapting to account for channel impairments and transmitter and receiver nonidealities.

Today's most advanced 110-Gb/s optical transmitters consist of large digital signal processor (DSP) engines feeding high-speed (HS) (56–60) GS/s nonreturn-to-zero (NRZ) digital-to-analog converters (DACs) [5]–[7], as shown in Fig. 1. Typically, the relatively small (< 1 V<sub>pp</sub> differential) DAC output signal is boosted by a large-swing linear driver that can provide the high (> 5 V<sub>pp</sub>) voltages required by state-of-the-art Mach–Zehnder modulators [8], [9] to achieve adequate extinction ratios. Although 1–2-V<sub>pp</sub> drive CMOS-based photonics modulators have been reported recently [10]–[13], they operate below 30 GBd. Higher voltage drive (4–6 V<sub>pp</sub>) is needed by the fastest silicon photonics modulators operating at or above 50 GBd [14].

A. Balteanu and S. P. Voinigescu are with the Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada M5S 3G4 (e-mail: balteanu1@eecg.toronto.edu).

P. Schvan is with the Ciena Corporation, Ottawa, ON, Canada K2K 3C8.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TMTT.2016.2525825

Fig. 1. Traditional optical transmitter consisting of a DSP, DAC, and linear amplifier driving a Mach–Zehnder modulator and the proposed implementation that utilizes the large swing DAC (dashed box) discussed in this paper.

The linear driver must operate with several dB of back-off in order not to distort the DAC signal, resulting in poor transmitter efficiency. Moreover, given the bandwidth and large voltage swing requirements, the driver is often implemented in a III–V technology [15], [16] leading to a multi-chip solution.

This work investigates possible solutions to integrate the DAC and linear amplifier into a saturated large swing power DAC, as illustrated in Fig. 1, capable of providing the large swing required to drive the optical modulator with the highest efficiency. Although the proposed power DAC architecture is technology agnostic, when implemented in 55-nm SiGe BiCMOS or sub-40-nm silicon-on-insulator (SOI) CMOS technologies, integration of the DSP also becomes feasible. With over 50-GHz output bandwidth, such a single-chip fully digital large swing transmitter solution becomes very attractive for future 400-Gb/s and 1-Tb/s systems because it would avoid transferring terabytes of data between two chips.

0018-9480 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Manuscript received April 09, 2015; revised August 26, 2015, November 29, 2015, and December 26, 2015; accepted December 29, 2015. Date of publication February 23, 2016; date of current version March 03, 2016. This work was supported by the Ciena Corporation. This paper is an expanded paper from the IEEE MTT-S International Microwave Symposium, Montreal, QC, Canada, June 17–22, 2012.

Fig. 2. Examples of: (a) a single-sideband optical transmitter based on a tuned power DAC and (b) a quadrature modulator for instrumentation applications.

The proposed power DAC architecture allows for adjustable data rates and clock frequencies and can be applied to NRZ, $2 \times$ time-interleaved [17], and tuned (i.e., carrier envelope modulation) power DACs. As will be shown in this paper, the latter can be used in single-sideband optical transmitters and in instrumentation, where complex modulated 1–60-GHz carriers can be synthesized, as needed to test fifth-generation (5G) millimeter-wave radio receivers operating at data rates of tens of gigabits per second. Examples of both proposed implementations are shown in Fig. 2.

The first version of this DAC architecture was presented in [18]. The input driver, the clock distribution path, and the output binary phase-shift keying (BPSK) stages have been modified to extend the bandwidth. Medium voltage (MV) devices in a common base configuration were added to the output stage, which increased the maximum output swing from 4 $V_{pp}$ in [18] to 6.4 $V_{pp}$ differential.

The proposed DAC architecture is introduced in Section II, which is followed by a detailed description of the design considerations and the transistor level implementation in Sections III and IV. The production 130-nm SiGe BiCMOS technology used to manufacture the chip is reviewed in Section V, while the experimental results are summarized in Section VI.

## II. PROPOSED DAC ARCHITECTURE

Traditional HS DACs feature a lumped output stage, which typically limits the output voltage swing to less than 1-V<sub>pp</sub> differential and/or the output bandwidth to less than 30 GHz [5], [6]. For larger output bandwidths, analog mixing and filtering of lower bandwidth DACs has been suggested [19]. Alternatively, as proposed in [18], an entirely distributed DAC architecture can be used to simultaneously extend both the output bandwidth and the output voltage swing of the DAC. Recently, a distributed output transmission line was employed as the output summer in a time-interleaved 100-GS/s DAC with 400-mV<sub>pp</sub> differential output swing manufactured in 28-nm LP CMOS [17]. However, most likely because of the thin dielectric and thin metal back-end, the output bandwidth remained below 20 GHz [17].

Fig. 3. Schematic of the distributed seven-stage DAC.

Fig. 4. Three possible topologies for the output stage in each cell: (a) current steering for NRZ DAC, (b) multiplexer for time-interleaved DAC, and (c) BPSK modulator for clocked envelope DAC.

If the distributed DAC consists of $2^i - 1$ identical cells, as illustrated in Fig. 3, full segmentation of the $i$ most significant bits (MSBs) is achieved by default. To minimize glitch energy when the MSBs switch, $i$ should be as large as possible, ideally equal to $n$, allowing for full segmentation of the $n$-bit DAC. In practical cases, $i$ is limited to 2–3. If $i = 4$, the DAC will have 15 cells, which makes it impractical for integration in silicon due to the large die size and prohibitive loss along the output transmission line [20]. To further increase the bit resolution of the DAC to $n = 2i$, each DAC cell can be realized as two current steering stages [see Fig. 4(a)] in a $2^i : 1$ ratio, each driven by independent data streams and connected in parallel at the output nodes. Again, full segmentation of the $i$ least significant bits (LSBs) is a default benefit of the distributed topology. For example, if $n = 6$ and $i = 3$, the resulting 6-bit distributed DAC will have seven cells and each cell will feature two current steering stages with a transistor and tail current ratio of 8:1. The three MSBs and three LSBs will be segmented and 14 independent data paths will be required, all synchronized by the same full rate sampling clock applied at the input of the distributed DAC. Even higher resolution, for example, 9 bits, could be achieved if a third properly scaled current steering output stage (in a 1:64 ratio with the MSB stage) were added to each DAC cell.
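As a rough illustration of the segmentation arithmetic for $n = 6$ and $i = 3$, the following Python sketch splits a 6-bit word into seven MSB and seven LSB cell controls and recombines them with the 8:1 weighting. The thermometer (unary) mapping, the order in which cells are switched on, and the names `segment_word` and `output_level` are assumptions made for illustration; they are not taken from the chip or its driving DSP.

```python
def segment_word(word, i=3):
    """Split a 2i-bit word into unary MSB and LSB controls for 2**i - 1 cells."""
    n_cells = 2 ** i - 1
    if not 0 <= word < 2 ** (2 * i):
        raise ValueError("word out of range")
    msb_value = word >> i               # upper i bits
    lsb_value = word & (2 ** i - 1)     # lower i bits
    # Assumed mapping: cell k is on when k < value (thermometer code).
    msb_bits = [1 if k < msb_value else 0 for k in range(n_cells)]
    lsb_bits = [1 if k < lsb_value else 0 for k in range(n_cells)]
    return msb_bits, lsb_bits


def output_level(msb_bits, lsb_bits, i=3):
    """Ideal output in LSB units: each MSB stage weighs 2**i LSB stages."""
    return 2 ** i * sum(msb_bits) + sum(lsb_bits)


msb, lsb = segment_word(0b101011)
print(msb, lsb, output_level(msb, lsb))
```

With ideal, equal cell weights every word from 0 to 63 maps back to itself, which is the property the 14 independent data paths rely on.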

Since the distributed DAC cells operate in switching mode, as long as the clock signal is large enough to saturate the current steering stages in all DAC cells, the frequency response of the input transmission line does not need to be constant as a function of frequency. Only the frequency response of the output transmission line needs to be flat and with minimal group-delay variation. It should be noted that each DAC cell can operate equally well in linear mode and in saturated power mode, with maximum efficiency, as in an HS current-steering output driver. Depending on the topology of the current-steering stages in each DAC cell, as shown in Fig. 4, NRZ, NRZ time-interleaved [21], and broadband saturated power DACs for carrier envelope modulation [18] can be implemented.

Unlike the first two, the circuit in Fig. 4(c) has the added benefit that the output dc level and the output impedance remain constant, irrespective of the digital word. In all cases the input transmission line is used to distribute the clock to each cell, while the output transmission line acts as broadband summer for the output currents of the current-steering cells and also ensures that output matching is achieved over the broadest possible frequency range.

## III. TRANSMISSION-LINE DESIGN FOR THE DISTRIBUTED DAC

As in a traditional distributed amplifier (DA), the cutoff frequency  $(f_c)$  of the artificial transmission line is given by

$$f_c = \min\left(\frac{1}{\pi\sqrt{L_{\rm in}C_{\rm in}}}, \frac{1}{\pi\sqrt{L_{\rm out}C_{\rm out}}}\right) \tag{1}$$

where  $C_{in}$  and  $C_{out}$  are the equivalent input and output capacitances of each cell, and  $L_{in}$  and  $L_{out}$  are the equivalent inductances of the input and output transmission-line sections of length l. For a given transistor technology and output stage topology of the DAC cell, the minimum value for  $C_{out}$  is determined by the size of the output device used, which, in turn, is determined by the output voltage swing. The maximum value for  $L_{out}$  will be set by the bandwidth requirement and by the characteristic impedance of the transmission line, given by

$$Z_o = \sqrt{\frac{L_{\rm in}}{C_{\rm in}}} = \sqrt{\frac{L_{\rm out}}{C_{\rm out}}} \tag{2}$$

and is typically 50  $\Omega$  for both lines, although the input line impedance could be different. As noted in [22], each segment of transmission line has an inherent delay ( $t_d$ ) associated with it equal to

$$t_d = \sqrt{L_{\rm in}C_{\rm in}} = \sqrt{L_{\rm out}C_{\rm out}}. \tag{3}$$

For example, if $C_{out} = 100$ fF and $Z_0 = 50\ \Omega$, then, from (1)–(3), $L_{out} = 250$ pH, $f_c \approx 64$ GHz, and $t_d = 5$ ps. Therefore, in order for the signals from each cell to be summed together correctly at the DAC output, the data paths for MSB<sub>n</sub> and LSB<sub>n</sub> associated with the *n*th cell must be delayed by $t_d$ with respect to those of the preceding stage. This is typically implemented in the DSP engine. However, the presence of the retiming flip-flops in the data paths of each cell can relax this requirement. A delay mismatch between the data lanes of up to half the clock period can be corrected by the retiming function of the flip-flops.
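The relations (1)–(3) can be evaluated directly for a given cell capacitance and line impedance. The short Python sketch below does this for the output line with $C_{out} = 100$ fF and $Z_0 = 50\ \Omega$, and lists the cumulative data delay that each of the seven cells would need relative to the first one. The function name `line_section` and the assumption that the input line has the same per-section delay are illustrative only.

```python
import math


def line_section(c_cell, z0):
    """Return (L, f_c, t_d) of one artificial-line section from (1)-(3)."""
    l_sec = z0 ** 2 * c_cell            # (2): Z0 = sqrt(L / C)
    t_d = math.sqrt(l_sec * c_cell)     # (3): delay of one section
    f_c = 1.0 / (math.pi * t_d)         # (1): cutoff frequency of this line
    return l_sec, f_c, t_d


l_out, f_c, t_d = line_section(c_cell=100e-15, z0=50.0)

# Data delay of each of the seven cells relative to the first cell.
cell_delays = [k * t_d for k in range(7)]

print(f"L_out = {l_out * 1e12:.0f} pH")
print(f"f_c   = {f_c / 1e9:.1f} GHz")
print(f"t_d   = {t_d * 1e12:.2f} ps")
print("cell delays (ps):", [round(d * 1e12, 2) for d in cell_delays])
```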

Fig. 5. Proposed structure for the DAC cell.

As already mentioned, the frequency-dependent loss along the output transmission line is an inherent limitation of the distributed power DAC architecture and it becomes more severe with increasing frequency. In a traditional DA, which operates in linear mode, the sum of the input and output transmission line losses encountered by the signal as it is distributed to each cell and collected at the output load is identical for each cell in the amplifier. The same applies for a distributed DAC operating in linear mode. However, since the DAC is designed to operate in limiting mode, the output current from cells closer to the input will reach the output load with higher attenuation than the currents from those cells closer to the output. For example, in the seven-cell DAC discussed earlier, the contribution to the output voltage swing due to the current of the MSB associated with the first DAC cell will be attenuated by $6 \times l \times \alpha(f)$ dB (where $\alpha(f)$ is the loss of the output line in dB/mm) compared to that of the MSB associated with the seventh DAC cell. This systematic integral nonlinearity (INL) and differential nonlinearity (DNL) error increases with the clock (carrier) frequency and can only be compensated from the LSBs. The backend of the technology will also play a significant role in reducing it. For example, if the output transmission-line loss is 0.1 dB/mm at 10 GHz and 0.4 dB/mm at 40 GHz, and the output transmission-line section length is 0.5 mm, the difference between the first cell and last cell MSBs is 0.3 and 1.2 dB, at 10 and 40 GHz, respectively. If not compensated with LSBs, the latter reduces the effective resolution of the saturated power DAC at 40 GHz to less than 3 bits.
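The $6 \times l \times \alpha(f)$ estimate can be reproduced with a few lines of Python. The sketch below uses the example values quoted in the text (0.5-mm sections, 0.1 and 0.4 dB/mm) and also converts the per-cell loss into a relative voltage weight at the load. The function names and the assumption that loss accumulates uniformly per section are illustrative simplifications.

```python
def msb_spread_db(n_cells, section_mm, alpha_db_per_mm):
    """Extra output-line loss seen by the first cell relative to the last."""
    return (n_cells - 1) * section_mm * alpha_db_per_mm


def cell_weights(n_cells, section_mm, alpha_db_per_mm):
    """Voltage weight of each cell at the load, normalized to the last cell."""
    return [10 ** (-(n_cells - 1 - k) * section_mm * alpha_db_per_mm / 20)
            for k in range(n_cells)]


for f_ghz, alpha in ((10, 0.1), (40, 0.4)):
    spread = msb_spread_db(7, 0.5, alpha)
    first = cell_weights(7, 0.5, alpha)[0]
    print(f"{f_ghz} GHz: spread = {spread:.1f} dB, "
          f"first-cell weight = {first:.3f}")
```

The normalized weights make the systematic error explicit: the first-cell MSB contributes less than the last-cell MSB, and the gap widens with frequency.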

The following sections will focus on the implementation of a broadband power DAC for carrier envelope modulation.

## IV. TRANSISTOR-LEVEL IMPLEMENTATION

In order to achieve 3  $V_{pp}$  per side in matched on-chip and off-chip 50- $\Omega$  loads, the total output current must be at least 120 mA. Therefore, each of the seven DAC cells must provide a total output current of at least 17.2 mA. At the output of each 2-bit DAC cell shown in Fig. 5 are two direct binary phase-shift keying (BPSK) modulators, one for the MSB and one for the LSB. They are 8:1 scaled versions of each other, each driven by a retimed data path, which ensures synchronicity of all the data paths, thus minimizing glitches during data transitions. The tail current of the MSB output stage was set to 16 mA and that of the LSB stage to 2 mA, allowing each DAC cell to provide 18 mA to the output transmission line for a total of 126 mA.
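The current budget above follows from Ohm's law on the doubly terminated output line. The Python sketch below repeats the arithmetic, assuming the on-chip and off-chip 50-$\Omega$ terminations appear in parallel and that the full tail current is steered to one side; the function name `required_current_ma` is hypothetical.

```python
def required_current_ma(vpp_per_side, r_chip=50.0, r_load=50.0):
    """Total switched current (mA) for a given single-ended swing."""
    r_eff = r_chip * r_load / (r_chip + r_load)   # 50 || 50 = 25 ohm
    return 1e3 * vpp_per_side / r_eff


n_cells = 7
i_required = required_current_ma(3.0)        # target: 3 Vpp per side
i_per_cell_min = i_required / n_cells        # minimum per-cell current
i_total = n_cells * (16.0 + 2.0)             # implemented MSB + LSB tails
swing_per_side = i_total * 1e-3 * 25.0       # resulting swing in volts

print(f"required total current: {i_required:.0f} mA")
print(f"minimum per cell:       {i_per_cell_min:.2f} mA")
print(f"implemented total:      {i_total:.0f} mA")
print(f"ideal swing per side:   {swing_per_side:.2f} Vpp")
```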

Once the data are retimed by the flip-flop, they must pass through two limiting differential inverters, which amplify the signal while simultaneously minimizing the clock feedthrough to the BPSK modulator.

Since the input clock signal has 50–60-GHz bandwidth and may be provided to the chip in single-ended fashion, a lumped preamplifier with very good common-mode rejection up to at least 50 GHz must be placed at the input of the DAC.

Fig. 6. Schematic of the BPSK modulator. MOSFET gate widths and HBT emitter lengths are provided in the inset. All HBTs have 0.13-$\mu$m emitter width and, unless otherwise stated, all MOSFETs have minimum gate length of 0.13 $\mu$m.

Although the proposed DAC cell can be ported to various silicon and III–V technologies, the following sections will detail its implementation in a production 0.13- $\mu$ m SiGe BiCMOS technology whose characteristics will be discussed in Section V. More recent work on DAs [23] manufactured in a prototype 55-nm SiGe BiCMOS technology [24] indicates that the proposed DAC architecture is scalable in future technology nodes and to systems operating at over 100 GBd with aggregate data rates exceeding 1 Tb/s per carrier [25].

It is important to note that most of the bias current in this architecture is consumed on the retiming data path and less than 13% of the current per DAC cell contributes to the output swing. Moving to a more advanced technology node such as the 55-nm SiGe BiCMOS technology [24] would allow the retiming data path to operate at a larger fan-out for the same data rate, thus reducing the power consumption. The output power of the DAC output stage remains the same for the same output swing, but at a given data rate, the proportion of power consumption due to the digital lanes would be considerably reduced.

## A. BPSK Output Stages

The BPSK modulator, shown in Fig. 6, is implemented using a double-balanced Gilbert cell, which maintains a constant dc level at each output of the cell and constant output impedance regardless of the digital word. Gilbert-cell topologies have been traditionally used in lumped and distributed variable gain amplifiers and equalizers [26]. The clock/carrier, which is the highest frequency signal in the DAC, is applied at the gates of the MOS transconductors, while the data bits are applied to the HBT Gilbert-cell quad. The fundamental frequency of the data path signals is at most half of the carrier/clock frequency; therefore, it is advantageous for the carrier/clock signal to drive the smaller load offered by one MOS gate versus two HBT bases.

Fig. 7. Inverter and EF composite stage used to drive the BPSK modulators. All HBTs have 0.13-$\mu$m emitter width and, unless otherwise stated, all MOSFETs have minimum gate length of 0.13 $\mu$m.

In order to achieve the highest possible bandwidth, the transistor sizes are chosen such that the drain/collector current density reaches the peak-$f_T$ current density ($J_{pfT}$) for MOSFETs, and $1.5 \times J_{pfT}$ for HBTs, respectively, when all the tail current is switched to one side [27]. For example, when Clk<sub>p</sub> is high, the entire tail current flows through M<sub>1</sub>, biasing it at 0.33 mA/$\mu$m, close to its $J_{pfT,\mathrm{MOS}}$ [27]. If Data<sub>p</sub> is also high, the tail current will then flow through $Q_2$ and $Q_6$, biasing them at 20 mA/$\mu$m<sup>2</sup> (or 2 mA/$\mu$m of emitter length), which corresponds to 1.5 times the measured $J_{pfT}$ of the HBT [29].
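To make the sizing rule concrete, the sketch below divides the tail currents quoted earlier (16 mA for the MSB stage, 2 mA for the LSB stage) by the stated fully switched current densities. This is only a back-of-the-envelope estimate under the assumption that a single device carries the whole tail current; the actual device sizes are those given in the inset of Fig. 6 and may differ. The function names are hypothetical.

```python
def mos_width_um(i_tail_ma, j_ma_per_um=0.33):
    """Gate width that puts the MOSFET at the given mA/um when fully on."""
    return i_tail_ma / j_ma_per_um


def hbt_emitter_length_um(i_tail_ma, j_ma_per_um=2.0):
    """Emitter length that puts the HBT at the given mA/um when fully on."""
    return i_tail_ma / j_ma_per_um


for name, i_tail in (("MSB", 16.0), ("LSB", 2.0)):
    print(f"{name}: W_MOS ~ {mos_width_um(i_tail):.1f} um, "
          f"l_E ~ {hbt_emitter_length_um(i_tail):.1f} um")
```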

An improvement over the implementation in [18] is the addition of common base transistors  $Q_5$  and  $Q_6$ , which allow for larger output swing while also providing higher output impedance and higher spurious-free dynamic range (SFDR). All HBTs are low-breakdown voltage (BV<sub>CEO</sub> = 1.6 V) HS devices. The use of MV HBTs for  $Q_5$  and  $Q_6$  was also investigated in order to further increase the output swing as in [30]. However, it was found that while they allowed for a higher output swing, the DAC bandwidth decreased by more than 33%. This is due to their lower  $f_T$  of only 150 GHz and their approximately three times smaller peak- $f_T$  current density [29]. The latter leads to increased device size, higher output capacitance, and, therefore, lower bandwidth.

In order to ensure a precise process- and strain-independent 8:1 ratio between the tail currents of the MSB and LSB modulators, eight $M_3$–$M_4$ units were connected in parallel in the MSB cell and their gate length was set to 0.18 $\mu$m, larger than the 0.13-$\mu$m minimum, in order to increase their output impedance. While a larger gate length for $M_3$ and $M_4$ would improve matching, this would require larger MOSFET devices, which would increase their output capacitance. The 500-fF capacitors are used for bias decoupling and are formed with polysilicon capacitors because the preferred metal-insulator-metal (MIM) capacitors could not satisfy layer density rules in a tightly packed area of the layout.

## B. Clock Amplifier

In order to reduce timing mismatches, both BPSK modulators are driven by the same clock signal provided by the cascaded differential inverter/emitter–follower (INV/EF) stages shown in Fig. 7. A minimum signal amplitude of 500 mV<sub>pp</sub> per side is needed to fully switch the MOSFETs in the BPSK modulator.

The emitter–follower (EF) stage is important for providing the proper dc level to the BPSK modulator. However, since its voltage gain is only 0.6 V/V at 60 GHz, the preceding inverter is designed to have over 900-mV output logic swing to ensure a minimum of 500-mV swing at the gates of M<sub>1</sub> and M<sub>2</sub>. Resistors were used instead of current sources in the EF stage in order to reduce the parasitic capacitance seen at the output node and prevent oscillation. The 20-pH inductor represents parasitic layout interconnects and must be accurately modeled. As mentioned previously, the HBTs were biased at $0.75 \times J_{pfT}$ when in equilibrium, and sized accordingly. All stages employ shunt-peaking inductors for bandwidth extension.

Fig. 8. MOS-HBT cascode stage used at the input of each DAC stage. All HBTs have 0.13-$\mu$m emitter width and, unless otherwise stated, all MOSFETs have minimum gate length of 0.13 $\mu$m.

The MOS-HBT cascode buffer illustrated in Fig. 8 was placed at the input of each DAC cell. When compared to an HBT cascode with the same tail current, it offers a higher Q input impedance with a smaller input capacitance, which results in higher 3-dB frequency and lower loss for the input transmission line. Additionally, the MOS-HBT differential cascode offers better stability than an HBT-only cascode [27].

Double EFs were placed after the BiCMOS cascode to further increase the bandwidth as well as provide the proper dc level shifting for the subsequent stage.

## C. Data Path

The inputs of the MSB and LSB data paths consist of two cascaded differential amplifiers, which provide single-ended to differential conversion and input matching. The schematic is shown in Fig. 9. In order to provide broadband  $50-\Omega$  matching,  $50-\Omega$  resistors in series with 70-pH inductors were placed at the input. The logic swing was set to 300 mV<sub>pp</sub> per side. In order to reduce the number of pads, only one side of the buffer is driven externally while the other is left floating. Each data lane has several cascaded buffers at its input that provide sufficient common-mode rejection, even at 56 Gb/s, such that a balanced differential signal reaches the retiming flip-flop in each lane.

The differential inverters are followed by the retiming flip-flop, which uses the BiCMOS topology shown in Fig. 10. The LSB flip-flop employs two 2-mA latches while the MSB one uses a scaled 6-mA latch followed by an 8-mA latch. The size scaling was needed to limit the fanout and maximize the bandwidth of the MSB path, given the large load presented by the 16-mA BPSK modulator.



Fig. 9. Schematic of the  $50-\Omega$  buffer used at the input of the data path. Only one side of the buffer is driven externally while the other is left floating.



Fig. 10. Retiming latch schematic with the associated device sizes. All HBTs have 0.13-$\mu$m emitter width.

Two buffers were inserted after the flip-flops to eliminate clock feedthrough, as in [6], and to amplify the signal to 480 mV<sub>pp</sub> per side at the data input of the BPSK modulator. A common-mode resistor to $V_{DD}$ is used to reduce the output common-mode level to 2 V and avoid breakdown of the MOS transistors in the modulator. The delay between the LSB and MSB data paths in each cell was carefully monitored in simulations during the design phase to ensure that two synchronized data signals arrive at the output of the BPSK cells. The synchronization of the two data paths can be further improved with the addition of programmable delay elements.



Fig. 11. Schematic of the first two stages of the lumped input amplifier. All HBTs have 0.13-$\mu$m emitter width and all MOSFETs have minimum gate length of 0.13 $\mu$m.

## D. Lumped Input Amplifier

The distributed section of the DAC is preceded by a three-stage lumped amplifier whose role is to amplify the off-chip clock/carrier signal and to provide input impedance matching and single-ended to differential conversion. The schematic of the first two stages is reproduced in Fig. 11. The first stage employs an HBT cascode and double EFs for bandwidth extension [28], while the second stage uses a MOS-HBT cascode topology and double EFs. A common-mode resistor is introduced in series with the tail current source in order to improve its output resistance and increase the common-mode rejection ratio (CMRR) at high frequency. The output capacitance of the tail current source shunts its output resistance, thus reducing the CMRR as the frequency increases, and can cause oscillations in common mode. The series resistor helps suppress the oscillation and improves the CMRR at millimeter-wave frequencies. However, the resistor introduces an additional dc drop, as well as its own capacitance, thus limiting the resistance value that can be used.

The third amplifier stage in the chain is shown in Fig. 12. Once again, a composite INV-EF topology is employed. Normally, the output signal would be taken directly at the emitter of the HBT. However, in order to present a 50-$\Omega$ impedance to the DA, a resistor was introduced in series with the output. The first two stages of the lumped amplifier provide sufficient CMRR, thus a resistor in series with the tail current source is no longer needed in the third stage, which helps save voltage headroom.

## V. TECHNOLOGY AND FABRICATION

The circuit was manufactured in a production 130-nm SiGe BiCMOS process with SiGe HBT  $f_T/f_{MAX}$  of 240/270 GHz, six metal layers, MIM capacitors, and polysilicon resistors. The top two copper layers are 3  $\mu$ m thick with 7-m $\Omega/\Box$  sheet resistance [29]. A detailed diagram of the back-end cross section can be seen in [31]. The microphotograph of the 3.1 × 1.8 mm<sup>2</sup> die is shown in Fig. 13. A more detailed micrograph of one of the DAC stages is shown in Fig. 14. When operating at peak output power, the DAC consumes 1.58 A from 2.5 V, 152 mA from 3.3 V, and 126 mA from 5.7-V supplies for a total of 5.2 W.
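The quoted total can be checked by summing the per-rail products of supply voltage and current. The Python snippet below is a simple arithmetic check using only the three supply figures stated above; the variable names are illustrative.

```python
# (supply voltage in V, supply current in A) at peak output power
supplies = [(2.5, 1.58), (3.3, 0.152), (5.7, 0.126)]

per_rail_w = [v * i for v, i in supplies]
total_w = sum(per_rail_w)

for (v, _), p in zip(supplies, per_rail_w):
    print(f"{v:.1f}-V rail: {p:.3f} W")
print(f"total: {total_w:.2f} W")
```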

Fig. 12. Schematics of the INV/EF pair that drives the DA. All HBTs have 0.13-$\mu$m emitter width and, unless otherwise stated, all MOSFETs have minimum gate length of 0.13 $\mu$m.

Fig. 13. Micrograph of the $3.1 \times 1.8 \text{ mm}^2$ DAC chip fabricated in a 0.13-$\mu$m SiGe BiCMOS technology.

Fig. 14. Micrograph of an individual DAC stage.

Several considerations were taken into account when designing the transmission line used in the distributed section of the DAC. The 4-$\mu$m minimum width of the transmission line was imposed by the electromigration rules. Similarly, the minimum length of the transmission-line section used between cells was dictated by the physical length of the layout of the DAC cell. In order to decrease the transmission-line segment length and its associated inductance, the first two inverters in the data paths were placed on the outside of the input and output transmission lines, as illustrated in Fig. 15. The corresponding minimum transmission-line segment length is 660 $\mu$m. Placing the entire data path on the outside of the transmission lines, and the BPSK modulators on the inside, would have resulted in a smaller minimum segment length of 350 $\mu$m. However, this would increase the physical distance between the retiming flip-flops and their respective clock amplifiers, which would require an increase in the tail current and power consumption of the clock amplifier stages to compensate for the longer interconnect and its associated loss. The 350-$\mu$m segment would also imply less coupling between the adjacent input and output transmission lines due to the shorter distance they travel together. However, in order to distribute the clock signal to the flip-flops, the shorter line would have to cross the clock signal one additional time versus the longer line, as well as cross the retimed data, which would result in additional coupling, making the 660-$\mu$m line preferable.

Fig. 15. Possible implementations of inter-stage transmission line layout.

Fig. 16. Depiction of two adjacent transmission lines.

The inter-stage transmission lines were implemented using the top 6th metal (M6) over a ground plane formed using the third metal layer (M3), which is 7.5  $\mu$ m below. Power supply, biasing currents, and control lines were passed between grounded M1 and M3 shields, using the second metal layer. A cross section is shown in Fig. 16. The center-to-center distance between the adjacent input and output transmission lines is 40  $\mu$ m and includes a 5- $\mu$ m-wide section in which metals 3–6 are shunted to ground to improve isolation.

The 660- $\mu$ m-long 5- $\mu$ m-wide transmission-line segment used between stages has a simulated loss of 0.35 dB at 70 GHz and a characteristic impedance of 69.3  $\Omega$ . The simulated attenuation constant ( $\alpha$ ) in Fig. 17 is in perfect agreement with measurements reported in [32] for this technology.

The simulated isolation between physically adjacent transmission lines is shown in Fig. 18 and is better than 10 dB up to 60 GHz. The isolation puts an upper limit on the gain of each cell in order to avoid oscillations. While the transmission lines are the main source of coupling between the input and output, layout parasitics provide additional feedback paths. As a rule of thumb, the isolation should be at least 10 dB higher than the gain from the input to the output to avoid oscillations. However, measurements of the DAC $S_{12}$ show that these isolation simulations are much too pessimistic, as is discussed in Section VI.

Fig. 17. Simulated attenuation constant of a 5-$\mu$m-wide transmission line formed of Metal 6 over a Metal 3 ground plane.

Fig. 18. Isolation between physically adjacent transmission lines.

Fig. 19. $S_{21}$ measurements for all $2^{14}$ states as a function of DAC output frequency.

## VI. EXPERIMENTAL RESULTS

All measurements were conducted on wafer in a 50- $\Omega$  environment. The setup, consisting of an Agilent N5227 PNA, 1-mm cables, and dc-110 GHz probes, was calibrated down to the probe tips using line–reflect–reflect–match (LRRM) calibration on an impedance standard substrate (ISS). S-parameter measurements, performed over all 2<sup>14</sup> states, show a peak gain of 31.8 dB and a unity gain bandwidth larger than 60 GHz, as illustrated in Fig. 19. Port 1 is defined as the clock input, while port 2 is the DAC output.



Fig. 20. Measured small-signal clock input return loss  $(\mathrm{S}_{11})$  for all  $2^{14}-1$  digital words.



Fig. 21. Measured small-signal DAC output return loss (S<sub>22</sub>) over all  $2^{14} - 1$  digital words.

The input (S<sub>11</sub>) and output (S<sub>22</sub>) return loss are plotted in Figs. 20 and 21, respectively. Both are better than -10 dB up to 50 GHz. The lumped input amplifier provides sufficient isolation to make the input matching insensitive to the digital word settings. Similarly, only a small variation in output matching (S<sub>22</sub>) is observed over bit settings, confirming that the output impedance of each stage does not change with the digital word. The measured DAC S<sub>12</sub> is better than -44 dB up to 60 GHz. It appears that the simulated coupling between the transmission lines shown in Fig. 18 is too pessimistic.

Since the DAC is meant to operate in saturated mode, a better measure of its performance is the output power ($P_{out}$). Saturated output power measurements performed over all 2<sup>14</sup> states, shown in Fig. 22, indicate that the DAC can produce 14.5 dBm per side (17.5-dBm differential) up to 40 GHz. This corresponds to a 3.36-V<sub>pp</sub> swing per side for a 50-$\Omega$ load.
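The conversion between output power and voltage swing can be sketched as follows, assuming a sinusoidal signal into a resistive 50-$\Omega$ load and equal power on both sides of the differential output. The function name `dbm_to_vpp` is hypothetical.

```python
import math


def dbm_to_vpp(p_dbm, r_load=50.0):
    """Peak-to-peak voltage of a sinusoid delivering p_dbm into r_load."""
    p_watt = 10 ** (p_dbm / 10) * 1e-3
    v_rms = math.sqrt(p_watt * r_load)
    return 2 * math.sqrt(2) * v_rms


p_side_dbm = 14.5
# Two equal-power sides add 10*log10(2), i.e. about 3 dB.
p_diff_dbm = p_side_dbm + 10 * math.log10(2)

print(f"swing per side:     {dbm_to_vpp(p_side_dbm):.2f} Vpp")
print(f"differential power: {p_diff_dbm:.1f} dBm")
```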

The dynamic range was calculated from both small- and large-signal measurements and plotted in Fig. 23. The theoretical large-signal dynamic range, defined here as the ratio of the largest carrier power that can be transmitted to the leakage power of the clock/carrier signal when the digital word is set to 000000, is  $20\log_{10} (63/1) \approx 36$  dB. The measurements deviate from this theoretical value due to several factors, such as coupling between the input and output transmission lines, mismatches in the tail currents of the BPSK stages, and frequency-dependent loss in the output transmission line. At high



Fig. 22. Measured saturated output power of the DAC for all 2<sup>14</sup> states as a function of DAC output frequency.



Fig. 23. Measured static dynamic range using both small- and large-signal measurements.



Fig. 24. Measured static DNL at various clock frequencies.

frequencies, the small-signal dynamic range becomes inaccurate due to the limited PNA sensitivity coupled with the low values of  $S_{21}$  of the clock path and decreased  $|S_{22}|$  matching.
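The ideal dynamic range quoted above follows directly from the ratio of the largest to the smallest nonzero output amplitude of an N-bit converter. A minimal sketch of that arithmetic is given below; the helper name `ideal_dynamic_range_db` is hypothetical.

```python
import math

def ideal_dynamic_range_db(n_bits):
    """Ratio of full-scale amplitude to one LSB, expressed in dB.

    The amplitude ratio is (2**n_bits - 1) : 1, so the power ratio in dB
    is 20 * log10(2**n_bits - 1).
    """
    return 20.0 * math.log10(2 ** n_bits - 1)

dr_6bit = ideal_dynamic_range_db(6)
print(f"6-bit ideal dynamic range: {dr_6bit:.2f} dB")
```

For six bits the magnitude is just under 36 dB, which is the reference against which the measured curves in Fig. 23 can be compared.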

Because the MSB and LSB segments are controlled independently, 2<sup>14</sup> states are available from which to select the best 2<sup>6</sup> digital words. The best combination of segments for each digital word was used to "calibrate" the DAC and obtain the best performance at each carrier/clock frequency. The measured DNL and INL at different carrier frequencies are shown in Figs. 24 and 25, respectively.
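One way to picture this selection step is as a search over all 14-bit states for the level closest to each ideal 6-bit code, followed by an endpoint-fit INL/DNL calculation. The sketch below is only an illustration under stated assumptions: the cell weights are synthetic (a 3% Gaussian mismatch with a fixed seed, not measured data), the cell ordering is arbitrary, and the exact selection procedure used for the measurements is not specified in the text.

```python
import random

N_MSB, N_LSB = 7, 7                        # 14 independent cells, as in the text
NOMINAL = [8.0] * N_MSB + [1.0] * N_LSB    # nominal cell weights in LSB units

# SYNTHETIC mismatch for illustration only (assumed 3% sigma, fixed seed).
rng = random.Random(0)
ACTUAL = [w * (1.0 + rng.gauss(0.0, 0.03)) for w in NOMINAL]

def level(state):
    """Output level of a 14-bit state word; bit i enables cell i."""
    return sum(w for i, w in enumerate(ACTUAL) if (state >> i) & 1)

def default_state(code):
    """Uncalibrated mapping: first (code >> 3) MSB cells, first (code & 7) LSB cells."""
    msb_mask = (1 << (code >> 3)) - 1
    lsb_mask = (1 << (code & 7)) - 1
    return msb_mask | (lsb_mask << N_MSB)

ALL_LEVELS = [level(s) for s in range(1 << (N_MSB + N_LSB))]
FULL_SCALE = ALL_LEVELS[-1]                # every cell enabled
LSB = FULL_SCALE / 63.0                    # endpoint-fit step size

def calibrate():
    """For each 6-bit code, pick the state whose level is closest to ideal."""
    table = []
    for code in range(64):
        target = code * LSB
        best = min(range(len(ALL_LEVELS)),
                   key=lambda s: abs(ALL_LEVELS[s] - target))
        table.append(best)
    return table

def inl_dnl(states):
    """Endpoint-fit INL and DNL, both in LSB, for a 64-entry state table."""
    lv = [ALL_LEVELS[s] for s in states]
    inl = [(lv[k] - k * LSB) / LSB for k in range(64)]
    dnl = [(lv[k + 1] - lv[k]) / LSB - 1.0 for k in range(63)]
    return inl, dnl

cal_table = calibrate()
inl_cal, dnl_cal = inl_dnl(cal_table)
inl_def, dnl_def = inl_dnl([default_state(c) for c in range(64)])

print("max |INL| uncalibrated:", max(abs(x) for x in inl_def))
print("max |INL| calibrated:  ", max(abs(x) for x in inl_cal))
```

Because the default mapping is itself one of the candidate states, the calibrated table can never have a larger per-code INL than the uncalibrated one in this model. In practice the table would be rebuilt from measured levels at each carrier/clock frequency.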

Large-signal measurements of the DAC were performed using an Anritsu MP1800A signal quality analyzer with an MP18121A 56-Gb/s MUX to create two high-speed PRBS7 data bits. Another two PRBS7 signals were obtained using a Centellax OTB1P1A board. Two MSBs and two LSBs were



Fig. 25. Measured static INL at various clock frequencies.



Fig. 26. Single-ended 44-Gb/s output eye diagram (bottom) with 2 MSBs switching at 44 Gb/s. Cable and probe losses were not de-embedded.



Fig. 27. Measured 45-GHz carrier (*bottom*) modulated over nine amplitude levels, sufficient for 128-QAM radio. Cable and probe losses were not de-embedded.

provided to the chip through HS probes at data rates up to 50 Gb/s, whereas the other five MSBs and five LSBs were controlled by lower speed (< 2.5 Gb/s) data streams. Since the two bit error rate testers (BERTs) and digital signal analyzer (DSA) could not be synchronized with the same low-noise millimeter-wave signal source, their drift causes additional jitter in the observed eye diagrams.

The output eye diagram obtained with two MSBs independently switching at 44 Gb/s is shown in Fig. 26. The input signal to one of the MSBs is shown at the top, while the 2-bit 44-Gb/s output signal is shown at the bottom. The signal attenuation due to probe and cable losses is not de-embedded from the output.

In Figs. 27 and 28, all seven MSBs and some LSBs were programmed to switch at 100 Mb/s to synthesize 45- and 56-GHz



Fig. 28. Measured 56-GHz carrier (*bottom*) modulated over six amplitude levels, sufficient for 64-QAM radio. Cable and probe losses were not de-embedded.



Fig. 29. Measured spectra of a 45-GHz carrier with all seven MSBs switching at 1 Gb/s. Cable and probe losses were not de-embedded. The resolution bandwidth is 16 kHz.

TABLE I

PERFORMANCE COMPARISON OF THE LARGE-SWING DAC

|                                        | [5]           | [6]                  | [33]              | [17]          | This Work                |
|----------------------------------------|---------------|----------------------|-------------------|---------------|--------------------------|
| Technology                             | 65 nm<br>CMOS | 0.5 μm<br>InP<br>HBT | 0.7 μm<br>InP HBT | 28 nm<br>CMOS | 130 nm<br>SiGe<br>BiCMOS |
| $f_T/f_{MAX}$ (GHz)                    | NA            | 290/320              | 300/270           | NA            | 240/270                  |
| Data Rate<br>(GS/s)                    | 56            | 60                   | 42                | 100           | 56                       |
| Sampling<br>Frequency<br>(GHz)         | 26.9          | 60                   | 42                | 25            | 56                       |
| Max. Output<br>(V<sub>pp</sub> diff.)  | 0.6           | 1                    | 3.2               | 0.27          | 6.7                      |
| Resolution<br>(bits)                   | 6             | 6                    | 3                 | 8             | 6                        |
| Power (W)                              | 0.75          | 1.8                  | 2                 | 2.5           | 5.2                      |
| Area (mm<sup>2</sup>)                  | 0.6×0.4       | 3×3                  | 1.5×1.2           | 1.6×0.9       | 3.1×1.8                  |

carriers modulated in amplitude, showing more than 3 bits of effective resolution. In Fig. 29, the spectrum of a 45-GHz carrier with seven MSBs switching at 1 Gb/s is shown.

## VII. CONCLUSIONS

A novel large-swing distributed DAC-driver architecture, suitable for next-generation software-defined fiber-optic systems with high-order quadrature amplitude modulation (QAM) and OFDM formats, was introduced. A proof-of-concept 6-bit implementation in a commercial 130-nm SiGe BiCMOS process has demonstrated 17.5-dBm output power (over  $6-V_{\rm pp}$  differential voltage swing in 50  $\Omega$ ) from dc to 40 GHz and over 13-dBm output power at 50 GHz. Operation at 44 GBd (limited by test equipment) and modulation of a 56-GHz carrier over six amplitude levels suggest that the DAC is suitable for 56-GBd systems with 64-QAM modulation. A comparison of the proposed large-swing DAC with other recently published work is summarized in Table I.

## ACKNOWLEDGMENT

Equipment loans were generously provided by Prof. J. Poon, Anritsu Corporation, and Agilent Technologies. The EMX simulation software was provided by Integrand Software Inc. The authors would also like to thank J. Pristupa and CMC for CAD support.

## REFERENCES

- [1] M. Karl and T. Herfet, "Transparent multi-hop protocol termination," in *IEEE 28th Int. Adv. Inf. Netw. Appl. Conf.*, 2014, pp. 253–259.
- [2] O. Rival, G. Villares, and A. Morea, "Impact of inter-channel nonlinearities on the planning of 25–100 Gb/s elastic optical networks," *J. Lightw. Technol.*, vol. 29, no. 9, pp. 1326–1334, Sep. 2011.
- [3] W. Wei, C. Wang, and J. Yu, "Cognitive optical networks: Key drivers, enabling techniques, and adaptive bandwidth services," *IEEE Commun. Mag.*, vol. 50, no. 1, pp. 106–113, Jan. 2012.
- [4] B. T. Teipen, H. Griesser, and M. H. Eiselt, "Flexible bandwidth and bit-rate programmability in future optical networks," in *14th Int. Transparent Opt. Netw. Conf.*, 2012, pp. 1–4.
- [5] Y. M. Greshishchev et al., "A 56 GS/s 6 b DAC in 65 nm CMOS with 256 × 6 b memory," in *IEEE Int. Solid-State Circuits Conf. Tech. Dig.*, 2011, pp. 194–196.
- [6] M. Nagatani, H. Nosaka, K. Sano, K. Murata, K. Kurishima, and M. Ida, "A 60-GS/s 6-bit DAC in 0.5-μm InP HBT technology for optical communications systems," in *IEEE Compound Semicond. Integr. Circuits Symp.*, 2011, pp. 1–4.
- [7] J. Godin et al., "InP DHBT very high speed power-DACs for spectrally efficient optical transmission systems," in *IEEE Compound Semicond. Integr. Circuits Symp.*, 2011, pp. 1–4.
- [8] E. Rouvalis *et al.*, "A low insertion loss and low Vπ InP IQ modulator for advanced modulation formats," in *Eur. Opt. Commun. Conf.*, 2014, pp. 1–3.
- [9] M. R. Watts, W. A. Zortman, D. C. Trotter, R. W. Young, and A. L. Lentine, "Low-voltage, compact, depletion-mode, silicon Mach–Zehnder modulator," *IEEE J. Sel. Top. Quantum Electron.*, vol. 16, no. 1, pp. 159–164, Jan.–Feb. 2010.
- [10] X. Wu et al., "A 20 Gb/s NRZ/PAM-4 1 V transmitter in 40 nm CMOS driving a Si-photonic modulator in 0.13 μm CMOS," in *IEEE Int. Solid-State Circuits Conf. Tech. Dig.*, 2013, pp. 128–129.
- [11] T. Kato, "InP modulators with linear accelerator like segmented electrode structure," in Opt. Fiber Commun. Conf. & Exhib., 2014, pp. 1–3.
- [12] N. Dupuis *et al.*, "30-Gb/s optical link combining heterogeneously integrated III–V/Si photonics with 32-nm CMOS Circuits," *J. Lightw. Technol.*, vol. 33, no. 3, pp. 657–662, Mar. 2015.
- [13] M. Traverso *et al.*, "25 GBaud PAM-4 error free transmission over both single mode fiber and multimode fiber in a QSFP form factor based on silicon photonics," in *Opt. Fiber Commun. Conf. Tech. Dig.*, 2015, Art. ID Th5B.3.
- [14] D. Mahgerefteh and C. Thompson, "Techno-economic comparison of silicon photonics and multimode VCSELs," in OSA Opt. Fiber Commun. Conf. Tech. Dig., 2015, Art. ID Th5B.3.
- [15] J. Dupuy et al., "A 6.2-Vpp 100-Gb/s selector-driver based on a differential distributed amplifier in 0.7-µm InP DHBT technology," in *IEEE MTT-S Int. Microw. Symp. Dig.*, 2012, pp. 1–3.

- [16] J.-Y. J. Godin *et al.*, "InP DHBT Mux-Drivers for very high symbol rate optical communications," in *IEEE Compound Semicond. Integr. Circuit Symp.*, 2014, pp. 128–132.
- [17] H. Huang, J. Heilmeyer, M. Grozing, and M. Berroth, "An 8-bit 100-GS/s distributed DAC in 28-nm CMOS," in *IEEE Radio Freq. Integr. Circuits Symp.*, 2014, pp. 65–68.
- [18] A. Balteanu, P. Schvan, and S. P. Voinigescu, "A 6-bit segmented RZ DAC architecture with up to 50-GHz sampling clock and 4 V<sub>pp</sub> differential swing," in *IEEE MTT-S Int. Microw. Symp. Dig.*, 2012, pp. 1–3.
- [19] C. Laperle, N. Ben-Hamida, and M. O'Sullivan, "Advances in high-speed DACs, ADCs, and DSP for software defined optical modems," in *IEEE Compound Semicond. Integr. Circuit Symp.*, 2013, pp. 1–4.
- [20] Y. Li, G. W. Ling, and Y.-Z. Xiong, "A up to 100 GHz broadband mixer with cascaded distributed amplifier," in *IEEE Compound Semicond. Integr. Circuits Symp.*, 2014, pp. 33–37.
- [21] M. Nagatani, H. Nosaka, S. Yamanaka, K. Sano, and K. Murata, "Ultrahigh-speed low-power DACs using InP HBTs for beyond-100-Gb/s/ch optical transmission systems," *IEEE J. Solid-State Circuits*, vol. 46, no. 10, pp. 2215–2225, Oct. 2011.
- [22] T. Y. K. Wong, A. P. Freundorfer, B. C. Beggs, and J. E. Sitch, "A 10 Gb/s AlGaAs/GaAs HBT high power fully differential limiting distributed amplifier for III–V Mach-Zehnder modulator," *IEEE J. Solid-State Circuits*, vol. 31, no. 10, pp. 1388–1393, Oct. 1996.
- [23] P. Hoffman, P. Schvan, A. Chevalier, A. Cathelin, and S. P. Voinigescu, "A low-noise, DC-135 GHz distributed amplifier for receiver applications," in *IEEE Compound Semicond. Integr. Circuit Symp.*, 2015, pp. 1–4.
- [24] P. Chevalier et al., "A 55 nm triple gate oxide 9 metal layers SiGe BiCMOS technology featuring 320 GHz fT/370 GHz fMAX HBT and high-Q millimeter-wave passives," in *IEEE Int. Electron Devices Meeting*, 2014, pp. 77–79.
- [25] R. Rios-Müller et al., "1-Terabit/s net data-rate transceiver based on single-carrier Nyquist-shaped 124 GBaud PDM-32QAM," in Opt. Fiber Commun. Conf. Tech. Dig., 2015, Art. ID Th5B.1.
- [26] H. Wu *et al.*, "Integrated transversal equalizers in high-speed fiber-optic systems," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2131–2137, Dec. 2003.
- [27] T. O. Dickson and S. P. Voinigescu, "Low-power circuits for a 2.5-V, 10.7-to-86-Gb/s serial transmitter in 130-nm SiGe BiCMOS," *IEEE J. Solid-State Circuits*, vol. 42, no. 10, pp. 2077–2085, Oct. 2007.
- [28] S. Trotta et al., "An 84 GHz bandwidth and 20 dB gain broadband amplifier in SiGe bipolar technology," *IEEE J. Solid-State Circuits*, vol. 42, no. 10, pp. 2099–2106, Oct. 2007.
- [29] G. Avenier *et al.*, "0.13 μm SiGe BiCMOS technology fully dedicated to mm-wave applications," *IEEE J. Solid-State Circuits*, vol. 44, no. 9, pp. 2312–2321, Sep. 2009.
- [30] R. A. Aroca and S. P. Voinigescu, "A large swing, 40-Gb/s SiGe BiCMOS driver with adjustable pre-emphasis for data transmission over 75 Ω coaxial cable," *IEEE J. Solid-State Circuits*, vol. 43, no. 10, pp. 2177–2186, Oct. 2008.
- [31] E. Dacquay et al., "D-band total power radiometer performance optimization in an SiGe HBT technology," *IEEE Trans. Microw. Theory Techn.*, vol. 60, no. 3, pp. 813–826, Mar. 2012.
- [32] K. H. K. Yau, E. Dacquay, I. Sarkas, and S. P. Voinigescu, "Device and IC characterization above 100 GHz," *IEEE Microw. Mag.*, vol. 13, no. 1, pp. 30–54, Jan.–Feb. 2012.
- [33] A. Konczykowska et al., "42 GBd 3-bit power-DAC for optical communications with advanced modulation formats in InP DHBT," Electron. Lett., vol. 47, no. 6, pp. 389–390, 2011.



Andreea Balteanu (GSM'10–M'14) received the B.A.Sc. degree in electrical engineering from the University of Waterloo, Waterloo, ON, Canada, in 2007, and the M.A.Sc. degree from the University of Toronto, Toronto, ON, Canada, in 2010, and is currently working toward the Ph.D. degree at the University of Toronto.

She has previously held internship positions with the IBM T. J. Watson Research Center, Altera Corporation, and Texas Instruments Incorporated. Her research interests include the design of high-speed and millimeter-wave integrated circuits.

Ms. Balteanu was the recipient of the 2012 IEEE Microwave Theory and Techniques Society (IEEE MTT-S) International Microwave Symposium (IMS) Best Student Paper Award.

Peter Schvan (M'89) received the Ph.D. degree in electronics from Carleton University, Ottawa, ON, Canada, in 1985.

After joining Nortel Networks, he was involved with the development of CMOS and BiCMOS technologies. Following that, he was involved in the design of high-speed circuits for fiber-optic and wireless communication using SiGe BiCMOS, InP, and CMOS technologies. He is currently the Director of analog design with the Ciena Corporation, Ottawa, ON, Canada, where he is responsible for the development of high-speed amplifiers and A/D and D/A converters. He has authored or coauthored several publications.



**Sorin P. Voinigescu** (S'91–M'95–SM'02) received the Ph.D. degree in electrical and computer engineering from the University of Toronto, Toronto, ON, Canada, in 1994.

From 1994 to 2002, he was initially with Nortel Networks and then with Quake Technologies, Ottawa, ON, Canada, where he was responsible for projects concerned with high-frequency characterization and statistical scalable compact model development for Si, SiGe, and III–V devices. He also conducted research on wireless and optical fiber building blocks and transceivers in these technologies. In 2002, he joined the University of Toronto, where he is currently a Full Professor. His research and teaching interests focus on atomic-scale semiconductor devices and their application in integrated circuits at frequencies beyond 300 GHz. In 2008 and 2015, he spent sabbatical leaves with Fujitsu Laboratories of America, Sunnyvale, CA, USA, and with Device Research Laboratories, NTT, Atsugi, Japan, respectively, where he conducted research on technologies and circuits for 100+Gb/s millimeter-wave radio and 1-Tb/s fiber-optic systems.

Dr. Voinigescu is a Member of the ITRS RF/AMS Committee and of the Technical Program Committee (TPC), IEEE BCTM. From 2003 to 2013, he served on the TPC and ExCOM of the IEEE CSICS. He was the recipient of the Nortel President Award for Innovation in 1996. In 2013, he was recognized with the ITAC Lifetime Career Award for his contributions to the Canadian Semiconductor Industry.